Extending Decision Tree Classifiers for Uncertain Data
Abstract
Traditionally, decision tree classifiers work with data whose values are known and precise. We extend such classifiers to handle data with uncertain information. Value uncertainty arises in many applications during the data collection process. Example sources of uncertainty include measurement/quantization errors, data staleness, and multiple repeated measurements. With uncertainty, the value of a data item is often represented not by one single value, but by multiple values forming a probability distribution. Rather than abstracting uncertain data by statistical derivatives (such as mean and median), we discover that the accuracy of a decision tree classifier can be much improved if the "complete information" of a data item (taking into account the probability density function (pdf)) is utilized. [1] We extend classical decision tree building algorithms to handle data tuples with uncertain values. Extensive experiments show that the resulting classifiers are more accurate than those using value averages. Processing pdfs is, however, computationally more costly than processing single values (e.g., averages).
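The contrast between averaging and using the full pdf can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes that each uncertain attribute value is given as a discrete pdf (a list of value/probability pairs) and that a tuple contributes to each side of a candidate split in proportion to its probability mass on that side. All function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def entropy(class_weights):
    """Shannon entropy (in bits) of a dict mapping class label -> weight."""
    total = sum(class_weights.values())
    if total <= 0:
        return 0.0
    h = 0.0
    for w in class_weights.values():
        if w > 0:
            p = w / total
            h -= p * math.log2(p)
    return h

def prob_at_most(pdf, z):
    """P(X <= z) for a discrete pdf given as [(value, probability), ...]."""
    return sum(p for x, p in pdf if x <= z)

def mean(pdf):
    """Expected value of a discrete pdf."""
    return sum(x * p for x, p in pdf)

def split_entropy_distribution(tuples, z):
    """Weighted entropy of the split 'X <= z' using each tuple's full pdf.

    Assumption: a tuple adds fractional weight P(X <= z) to the left
    branch and 1 - P(X <= z) to the right branch.
    """
    left, right = defaultdict(float), defaultdict(float)
    for pdf, label in tuples:
        p_left = prob_at_most(pdf, z)
        left[label] += p_left
        right[label] += 1.0 - p_left
    n = len(tuples)
    return (sum(left.values()) / n) * entropy(left) \
         + (sum(right.values()) / n) * entropy(right)

def split_entropy_averaging(tuples, z):
    """Same split, but each pdf is first collapsed to its mean."""
    collapsed = [([(mean(pdf), 1.0)], label) for pdf, label in tuples]
    return split_entropy_distribution(collapsed, z)

# Toy data (hypothetical): both tuples have mean 2.0, but different pdfs.
toy = [
    ([(1.0, 0.5), (3.0, 0.5)], "A"),
    ([(2.0, 1.0)], "B"),
]
avg_h = split_entropy_averaging(toy, 2.0)
dist_h = split_entropy_distribution(toy, 2.0)
print(f"averaging: {avg_h:.4f}  distribution-based: {dist_h:.4f}")
```

In this toy case the two tuples are indistinguishable once reduced to their means, so the averaging variant cannot separate the classes at the threshold, whereas the distribution-based variant yields a lower split entropy. The extra work in `prob_at_most` for every tuple and every candidate threshold also hints at why processing pdfs costs more than processing single values.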
Related resources
Classification: A Decision Tree For Uncertain Data Using CDF
Decision trees are suitable and widely used for describing classification phenomena. This paper presents a decision-tree-based classification system for uncertain data. Uncertain data are data that lack certainty. Data uncertainty arises from different factors, including sensor error, network latency, measurement precision limitations, and multiple repeated measurements. We find that decision tr...
Classification of Categorical Uncertain Data Using Decision Tree
Certain data are data whose values are known precisely, whereas uncertain data are data whose values are not known precisely. But data is always uncertain in real-life applications. Under data uncertainty, an attribute value is represented by a set of values. There are two types of attributes in data sets, namely numerical and categorical attributes. Data uncertainty can arise in both numerical and catego...
DTU: A Decision Tree for Uncertain Data
The decision tree is a widely used data classification technique. This paper proposes a decision-tree-based classification method for uncertain data. Data uncertainty is common in emerging applications, such as sensor networks, moving object databases, and medical and biological databases. Data uncertainty can be caused by various factors including measurement precision limitations, outdated sources, sensor...
Research on Dynamic Cost-sensitive Decision Tree for Mining Uncertain Data Based on the Genetic Algorithm
The existing classifiers for uncertain data do not consider dynamic cost, so this paper proposes a classification approach, the dynamic cost-sensitive decision tree for uncertain data based on the genetic algorithm (GDCDTU), which overcomes the limitations of stationary cost and automatically searches for a suitable cost space for every sub-dataset. Firstly, this paper gives the dyna...
Performance Analysis on Uncertain Data using Decision Tree
Data uncertainty is common in emerging applications, such as sensor networks, moving object databases, and medical and biological fields. Data uncertainty can be caused by various factors including measurement precision limitations. Data uncertainty is inherent in various applications for reasons such as outdated sources, imprecise measurement, and transmission problems. Classificati...
Publication date: 2012